We test the performance of GAN models for lip synchronization. To this end, we reimplement LipGAN in PyTorch, train it on the GRID dataset, and compare it to our own variation, L1WGAN-GP, which is adapted to the LipGAN architecture and also trained on GRID.
Large-scale diffusion neural networks represent a substantial milestone in text-to-image generation, but they remain poorly understood, lacking interpretability analyses. In this paper, we perform a text-image attribution analysis on Stable Diffusion, a recently open-sourced model. To produce pixel-level attribution maps, we upscale and aggregate cross-attention word-pixel scores in the denoising subnetwork, naming our method DAAM. We evaluate its correctness by testing its semantic segmentation ability on nouns, as well as its generalized attribution quality on all parts of speech, rated by humans. We then apply DAAM to study the role of syntax in the pixel space, characterizing head--dependent heat map interaction patterns for ten common dependency relations. Finally, we study several semantic phenomena using DAAM, with a focus on feature entanglement, where we find that cohyponyms worsen generation quality and descriptive adjectives attend too broadly. To our knowledge, we are the first to interpret large diffusion models from a visuolinguistic perspective, which enables future lines of research. Our code is at https://github.com/castorini/daam.
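The "upscale and aggregate" step described above can be illustrated with a minimal sketch. It assumes that the cross-attention scores for one word are already available as square grids of different resolutions; the function names `upscale_nearest` and `aggregate_maps` are hypothetical, and nearest-neighbour upscaling is swapped in for simplicity, since the abstract does not state which interpolation the method uses.

```python
from typing import List

Grid = List[List[float]]


def upscale_nearest(grid: Grid, size: int) -> Grid:
    """Upscale a square grid to size x size by nearest-neighbour repetition.

    Assumption: `size` is an integer multiple of the grid's side length.
    """
    side = len(grid)
    if side == 0 or size % side != 0:
        raise ValueError("size must be a positive multiple of the grid side")
    factor = size // side
    return [
        [grid[r // factor][c // factor] for c in range(size)]
        for r in range(size)
    ]


def aggregate_maps(maps: List[Grid], size: int) -> Grid:
    """Sum several low-resolution score maps at a common resolution."""
    total = [[0.0] * size for _ in range(size)]
    for grid in maps:
        upscaled = upscale_nearest(grid, size)
        for r in range(size):
            for c in range(size):
                total[r][c] += upscaled[r][c]
    return total


# Hypothetical word-pixel scores at two resolutions (1x1 and 2x2).
heat_map = aggregate_maps([[[1.0]], [[1.0, 2.0], [3.0, 4.0]]], size=2)
```

Here the 1x1 map is broadcast over the whole 2x2 output before the two maps are summed, which is the sense in which coarse and fine attention layers contribute to one pixel-level map.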
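Factorisation-based models (FMs), such as DistMult, have enjoyed enduring success in knowledge graph completion (KGC) tasks, often outperforming graph neural networks (GNNs). Unlike GNNs, however, FMs struggle to incorporate node features and to generalise to unseen nodes in inductive settings. Our work bridges the gap between FMs and GNNs by proposing ReFactor GNNs. This new architecture draws on both modelling paradigms, which were previously thought of as largely disjoint. Concretely, using a message-passing formalism, we show how FMs can be cast as GNNs by reformulating the gradient descent procedure as message-passing operations, which forms the basis of our ReFactor GNNs. Across a multitude of well-established KGC benchmarks, our ReFactor GNNs achieve transductive performance comparable to FMs and state-of-the-art inductive performance, while using an order of magnitude fewer parameters.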
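For readers unfamiliar with DistMult, the factorisation model named above, its scoring function is the trilinear product of a subject embedding, a relation vector, and an object embedding. The sketch below is illustrative only: the function name `distmult_score` and the toy vectors are assumptions, and it does not reproduce the ReFactor GNN architecture.

```python
from typing import Sequence


def distmult_score(
    subj: Sequence[float], rel: Sequence[float], obj: Sequence[float]
) -> float:
    """DistMult score: sum_i subj[i] * rel[i] * obj[i]."""
    if not (len(subj) == len(rel) == len(obj)):
        raise ValueError("all three vectors must have the same dimension")
    return sum(s * r * o for s, r, o in zip(subj, rel, obj))


# Toy two-dimensional embeddings (hypothetical values).
score = distmult_score([1.0, 2.0], [3.0, 4.0], [5.0, 6.0])  # 1*3*5 + 2*4*6 = 63.0
```

Because the product is symmetric in subject and object, DistMult assigns the same score to a triple and its reverse.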
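Relation extraction in the biomedical domain is challenging due to the lack of labeled data and high annotation costs, which require domain experts. Distant supervision is commonly used to tackle the scarcity of annotated data by pairing knowledge graph relationships with raw text. Such a pipeline is prone to noise and poses added challenges when scaling to cover a large number of biomedical concepts. We investigated existing broad-coverage distantly supervised biomedical relation extraction benchmarks and found an overlap between training and test relationships ranging from 26% to 86%. Furthermore, we noticed several inconsistencies in the data construction process of these benchmarks, and where there is no train-test leakage, the focus is on interactions between narrower entity types. This work presents a more accurate benchmark, MedDistant19, for broad-coverage distantly supervised biomedical relation extraction that addresses these shortcomings and is obtained by aligning MEDLINE abstracts with the widely used SNOMED Clinical Terms. Since a thorough evaluation with domain-specific language models has been lacking, we also conduct experiments to test whether general-domain relation extraction findings carry over to biomedical relation extraction.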
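In Dynamic Adversarial Data Collection (DADC), human annotators are tasked with finding examples that models struggle to predict correctly. Models trained on DADC-collected training data have been shown to be more robust in adversarial and out-of-domain settings, and are harder for humans to fool. However, DADC is more time-consuming than traditional data collection and thus more costly per example. In this work, we examine whether we can keep the advantages of DADC without incurring the additional cost. To that end, we introduce Generative Annotation Assistants (GAAs), generator-in-the-loop models that provide real-time suggestions that annotators can approve, modify, or reject entirely. We collect training datasets in twenty experimental settings and perform a detailed analysis of this approach for the task of extractive question answering (QA), for both standard and adversarial data collection. We show that GAAs provide significant efficiency benefits in terms of annotation speed, while leading to improved model fooling rates. In addition, we show that GAA-assisted data leads to higher downstream model performance on a variety of question answering tasks.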
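In this work, we continue to build on recent advances in reinforcement learning for finite Markov processes. A common approach in existing algorithms, both single-actor and distributed, is to either clip rewards or apply a transformation to the Q-functions in order to handle the wide range of magnitudes in real discounted returns. We show theoretically that one of the most successful methods may not yield an optimal policy if the process is non-deterministic. As a solution, we argue that distributional reinforcement learning lends itself to remedying this situation completely. By introducing a conjugated distributional operator, we can handle a large class of transformations of real returns with guaranteed theoretical convergence. We propose an approximate single-actor algorithm based on this operator that trains agents directly on unaltered rewards, using a proper distributional metric given by the Cramér distance. To evaluate its performance in a stochastic setting, we train agents on a suite of 35 Atari 2600 games using sticky actions and obtain state-of-the-art performance compared to other well-known algorithms in the Dopamine framework.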
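As a concrete illustration of the metric named above, the following minimal sketch computes the squared Cramér distance between two discrete distributions that share a sorted support, by integrating the squared difference of their step-function CDFs. The function name `squared_cramer_distance` and the toy inputs are assumptions; this is not the training loss used in the paper, only the underlying distance.

```python
from typing import Sequence


def squared_cramer_distance(
    p: Sequence[float], q: Sequence[float], support: Sequence[float]
) -> float:
    """Integral of (F_p(x) - F_q(x))^2 over x for distributions on `support`.

    Assumptions: `support` is strictly increasing, and `p` and `q` are
    probability vectors over the same support points.
    """
    if not (len(p) == len(q) == len(support)):
        raise ValueError("p, q and support must have the same length")
    total = 0.0
    cdf_p = 0.0
    cdf_q = 0.0
    # Between consecutive atoms both CDFs are constant, so the integral is a
    # sum of (CDF difference)^2 times the gap width.
    for i in range(len(support) - 1):
        cdf_p += p[i]
        cdf_q += q[i]
        gap = support[i + 1] - support[i]
        total += (cdf_p - cdf_q) ** 2 * gap
    return total


# Shifting all mass by one support step (hypothetical toy distributions).
d = squared_cramer_distance([0.5, 0.5, 0.0], [0.0, 0.5, 0.5], [0.0, 1.0, 2.0])
```

Unlike a pointwise divergence, this distance depends on how far apart the atoms are, which is why it is sensitive to the magnitude of returns.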
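Recent work on open-domain question answering has shown a large discrepancy in model performance between novel test questions and those that largely overlap with training questions. However, it is still unclear which aspects of novel questions make them challenging. Drawing on studies of systematic generalization, we introduce and annotate questions according to three categories that measure different levels and kinds of generalization: training set overlap, compositional generalization (comp-gen), and novel entity generalization (novel-entity). When evaluating six popular parametric and non-parametric models, we find that on the established Natural Questions and TriviaQA datasets, even the strongest model's performance on comp-gen/novel-entity questions is 13.1/5.4% and 9.6/1.5% lower than on the full test set, which indicates the challenge posed by these types of questions. Furthermore, we show that while non-parametric models can handle questions containing novel entities relatively well, they struggle with those requiring compositional generalization. Finally, we find that the key factors in question difficulty are cascading errors from the retrieval component, the frequency of the question pattern, and the frequency of the entity.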
Link prediction for knowledge graphs is the task of predicting missing relationships between entities. Previous work on link prediction has focused on shallow, fast models which can scale to large knowledge graphs. However, these models learn less expressive features than deep, multi-layer models, which potentially limits performance. In this work we introduce ConvE, a multi-layer convolutional network model for link prediction, and report state-of-the-art results for several established datasets. We also show that the model is highly parameter efficient, yielding the same performance as DistMult and R-GCN with 8x and 17x fewer parameters. Analysis of our model suggests that it is particularly effective at modelling nodes with high indegree, which are common in highly connected, complex knowledge graphs such as Freebase and YAGO3. In addition, it has been noted that the WN18 and FB15k datasets suffer from test set leakage, due to inverse relations from the training set being present in the test set; however, the extent of this issue has so far not been quantified. We find this problem to be severe: a simple rule-based model can achieve state-of-the-art results on both WN18 and FB15k. To ensure that models are evaluated on datasets where simply exploiting inverse relations cannot yield competitive results, we investigate and validate several commonly used datasets, deriving robust variants where necessary. We then perform experiments on these robust datasets for our own and several previously proposed models, and find that ConvE achieves state-of-the-art Mean Reciprocal Rank across most datasets.
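The test set leakage described above can be made concrete with a minimal sketch of an inverse-relation rule, together with the Mean Reciprocal Rank metric. This is an illustration under stated assumptions, not the rule-based model from the paper: the mapping `inverse_of` is assumed to be given rather than learned from data, and the function names and toy triples are hypothetical.

```python
from typing import Dict, Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object)


def inverse_rule_predict(
    train: Iterable[Triple], inverse_of: Dict[str, str], subj: str, rel: str
) -> Set[str]:
    """Answer the query (subj, rel, ?) using only inverse triples from training.

    If (o, rel_inv, subj) was seen in training and rel_inv is the inverse of
    rel, then o is predicted as an object for (subj, rel, ?).
    """
    rel_inv = inverse_of.get(rel)
    if rel_inv is None:
        return set()
    return {s for s, r, o in train if r == rel_inv and o == subj}


def mean_reciprocal_rank(ranks: List[int]) -> float:
    """Average of 1/rank over queries; ranks are 1-based."""
    if not ranks or any(rank < 1 for rank in ranks):
        raise ValueError("ranks must be a non-empty list of positive integers")
    return sum(1.0 / rank for rank in ranks) / len(ranks)


# Hypothetical toy data: the test triple is the mirror of a training triple.
train_triples = [("dog", "hyponym_of", "animal")]
inverses = {"hypernym_of": "hyponym_of", "hyponym_of": "hypernym_of"}
answers = inverse_rule_predict(train_triples, inverses, "animal", "hypernym_of")
mrr = mean_reciprocal_rank([1, 2, 4])
```

When most test triples have a mirrored counterpart in the training set, a lookup of this kind ranks the correct entity first without learning any embeddings, which is why datasets with inverse relations removed give a more meaningful comparison.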